NMS MPS device wrapper #9620

Merged
merged 2 commits into from
Sep 27, 2022

Conversation

glenn-jocher
Member

glenn-jocher commented Sep 27, 2022

May resolve #9613

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Improvement in MPS (Apple Silicon) support for Non-Maximum Suppression (NMS) in YOLOv5.

📊 Key Changes

  • Added device detection to better handle Apple's Metal Performance Shaders (MPS).
  • Introduced conditional processing to maintain tensor device consistency for MPS.
  • Ensured that output tensors are moved back to the original device (if MPS) after NMS.
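The three changes above can be sketched roughly as follows. This is a minimal illustration rather than the merged code: `nms_with_device_wrapper` is a hypothetical name, the confidence filter is only a placeholder for the real NMS step, and running the work on CPU is an assumption about where the tensors go before they are "moved back".

```python
import torch


def nms_with_device_wrapper(prediction, conf_thres=0.25):
    """Hypothetical stand-in for an NMS routine; only the device handling matters here."""
    device = prediction.device      # remember where the caller's tensor lives
    mps = device.type == "mps"      # True only on Apple's Metal backend
    if mps:
        # Assumption: the NMS work runs on CPU while MPS support is incomplete.
        prediction = prediction.cpu()

    output = []
    for x in prediction:            # one image at a time, rows of (x1, y1, x2, y2, conf, cls)
        x = x[x[:, 4] > conf_thres]  # placeholder for the real filtering + NMS
        if mps:
            x = x.to(device)        # move the result back to the original device
        output.append(x)
    return output
```

On a non-MPS device the `mps` flag is false and the function never moves anything, so CUDA and CPU behavior is unchanged.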

🎯 Purpose & Impact

  • 🍏 Enhances compatibility with Apple Silicon by addressing MPS limitations.
  • ⚡ Improves the efficiency of YOLOv5 on MPS-supported devices by ensuring proper tensor device handling.
  • 💡 Allows developers and users with MPS-enabled devices to leverage native hardware acceleration without functionality issues in the NMS part of the inference.


markusuntera commented Mar 22, 2023

There's still a problem. If some tensors are empty (no boxes), their device is not set back to MPS, so the outputs end up on multiple devices.

For example, there should be the following before returning from "non_max_suppression":

    if mps:
        output = [a.to(device) for a in output]
    return output
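To make the empty-tensor case concrete, here is a small sketch of how it could arise and how the snippet above would cover it. It assumes the output list is pre-filled with empty CPU placeholders and that images with no boxes are skipped before any per-image move; `collect_outputs` and its arguments are hypothetical names, not functions from the repository.

```python
import torch


def collect_outputs(per_image, device):
    """Hypothetical helper: per_image holds CPU results, or None where no boxes survived."""
    mps = device.type == "mps"
    # Assumption: placeholders are created on CPU, like the intermediate NMS results.
    output = [torch.zeros((0, 6))] * len(per_image)
    for i, x in enumerate(per_image):
        if x is None or x.shape[0] == 0:
            continue                # the empty placeholder stays in place, still on CPU
        output[i] = x
    if mps:
        # The suggested fix: one pass at the end moves empty and non-empty results alike.
        output = [a.to(device) for a in output]
    return output
```

Moving everything in a single pass just before the return means the skipped images cannot be missed, whichever branch they took inside the loop.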

@glenn-jocher
Member Author

@markusuntera thanks for sharing your feedback! It sounds like empty output tensors are not being moved back to the MPS device, which leaves the outputs spread across multiple devices. Adding a conditional step that moves every output back to the MPS device before returning from "non_max_suppression" could be a viable solution. Have you tested this approach? Let me know if I can assist further!

Successfully merging this pull request may close these issues.

Training RuntimeError using MPS: Expected all tensors to be on the same device